
(1) GENERAL INFORMATION: 

(i) APPLICANT: Treco, Douglas A.
Heartlein, Michael W.
Hauge, Brian M.
Selden, Richard F.

(ii) TITLE OF INVENTION: Protein Production and Delivery 
(iii) NUMBER OF SEQUENCES: 30 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Hamilton, Brook, Smith & Reynolds, P.C.

(B) STREET: Two Militia Drive 

(C) CITY: Lexington 

(D) STATE: Massachusetts

(E) COUNTRY: USA 

(F) ZIP: 02173 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/406,030

(B) FILING DATE: 17-MAR-1995 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/243,391

(B) FILING DATE: 13-MAY-1994

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/985,586 

(B) FILING DATE: 03-DEC-1992 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/911,533 

(B) FILING DATE: 10-JUL-1992 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/787,840 

(B) FILING DATE: 05-NOV-1991 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 07/789,188 

(B) FILING DATE: 05-NOV-1991 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US93/11704 

(B) FILING DATE: 02-DEC-1993 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US92/09627 

(B) FILING DATE: 05-NOV-1992

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Granahan, Patricia 

(B) REGISTRATION NUMBER: 32,227 

(C) REFERENCE/DOCKET NUMBER: TKT95-01 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617) 861-6240 

(B) TELEFAX: (617) 861-9540 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

AATTGCTCCT CGTGGTCATG CTTCT 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
CTGTGAAGGA CATGGGAGTC A 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4488 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
TCTAGAGTCA GGATGGCACT GAAGGTCTCT GGGGAAGGGA CGATGATGAG AGCCCGTCAG 60

AAACCCTCCC CCCTTTCCTG [illegible] [illegible] ACTTCACGCC CGGGGCTCTT 120

TGCTCCCTAC [illegible] [illegible] GATGAGAGCC CCCAGACCTC CCXGAAGGGT 180

[illegible] [illegible] [illegible] GTTCTGCCCT AAGGAGCCGC AGAGACAACC 240

[illegible] [illegible] [illegible] GGAGCAGAGA GACAAGAAGG CCCTACGCTC 300

[illegible] [illegible] [illegible] GCCCAGGCAG GGCTTATGAA AAAGGAACAT 360

GGAAAGGAAC [illegible] [illegible] PTTAAGAAAG AACGCTGGAG CCAGATGCTT 420

GGGTTCCAAT [illegible] [illegible] CCTGTGTGAC CTTGAATCAA ATCACATTAT 480

CCTACTGAGC [illegible] [illegible] [illegible] ATAATGTCAG TGCCTTCCTC 540

GGACTGGGCT [illegible] [illegible] [illegible] CATGCTCTCG GCACAGTGCC 600

[illegible] [illegible] [illegible] TCTCACCAGG CCTATCTTGG GTTGRGTGGG 660

[illegible] [illegible] APAPTRPPAT TGGAGTCTGA GAAGCGGATC CTGGTAGGGC 720

[illegible] [illegible] [illegible] GGCCGGACTG AGCCAAAAGC AGCCCCTCCC 780

AGCTCTCCCA [illegible] [illegible] AGCGTGACCC CTCCTTGCTC CTTCCCCTTT 840

CTCACCGGGX [illegible] [illegible] GGCTAGAGCG CCAGCAGCGA GACTCGGCTC 900

GTGCGACGGC [illegible] [illegible] GCAGCGCCAC GAAGTCTGGG ACGGGAGGAA 960

[illegible] [illegible] [illegible] GGTGGCCCAG CCTCAACCAC AACCCCGCTG 1020

[illegible] [illegible] [illegible] CCACGGGCCC GCTCCTCAGC GCCTGGCTCC 1080

[illegible] [illegible] [illegible] TCCCGCGGAT ACACGAAGGA CAGGCCGCTC 1140

[illegible] [illegible] [illegible] GSGGGGGGGT AAGAACACGG GCTTCAGCTG 1200

[illegible] [illegible] [illegible] CCAAGTGGCC CGGGACCTAG TATCGTGGCC 1260

[illegible] [illegible] [illegible] TACCCTGGGG GCAGGTCTGG CAGCAGTGTC 1320

[illegible] [illegible] PPAPAGGCCG GGGTTGGGCA CTCTGGTTTG ATGTTCTTGC 1380

[illegible] [illegible] [illegible] ACCCCACTGA GGCTGCTCCC GGAAAAGGCG 1440

[illegible] [illegible] GATGCCAACT GATGAGACCC CCCCAGGCAA GGATGTCCCG 1500

[illegible] [illegible] APTTAPAAGC TGCGTGACCC TAGACAAGCT ACTTCATCTC 1560

[illegible] [illegible] [illegible] GGGGATAATA ATACTCTCTA TCTAGCAAGG 1620

CTGCCATGAG [illegible] [illegible] AAACGGAGTT GGCACAGAGC CTCACACAGA 1680

GTGGGCGATC AGTAACAGCA CCTAAGAATT GGAGGGGCTG ATTCCCCTTC [illegible] 1740

AAAATATCCC CAACATCTGC CGACTGGGCT CCTTCTCAGC AGCTCCGAGT CCACTCCGAC 1800

GCCCGCGCGA CCCGGCCGTC CCCACCCGCC AGCCCGGGCC GGCCGCGGGG TGCACTCACC 1860

GCCTCGCAGG CCACAGCACG CAGCGCATCA CCCCGAATGG CTCCCCTAGG TCCGGGTGCC 1920

ACGTCTCGTC CAAGGCATAG ACCTTCCCGC CGAAGTGCAG CCTGCGGGAC GGGCTTGGCT 1980

GGAGGCGCTG CCCAGCTCGC GCCGTGTGCC GCCCCGGGGG CTGCCCGCGG GTCCCGGGTC 2040

CCAGGCACCG CGCCCTTCTG CCCCCGCCCA CCCTCCGGGC CGCCCGCCGC GCCGAGCCAC 2100

CTGCGCCCCG CGCCCCTCCT CCGGCTCGGC TGACTCGCCC CGAGCCCGAC TCCCCGCCCG 2160

CCTCCCCCGG GCGCCCACCT ACCCTGCTGC CCGAACGGGC AGCGGCTCCT TCTCAGAACG 2220

GATGGGCAGC ACGGGGGCTC TCGGGCCGCG CGGGGCGGGA GCCGAGCAGC AGCAGCCCGA 2280

GGAGCAGCAG CGGGGCCGGC GGGGCCGGGA GGGCHCGGCA TGACGCGAAC GGGACAGCTG 2340

GGGAGGAGGG AGGGAGGAGG GCGCGGGAGC GGGCGGAGGG AGGGAGGCGG GAGTGCGGAG 2400

GGCGGAGGGC CGGGCCGGGG GCGGTGCGGC GGGAGGGGGC CGGGGCCGGG GCCGGGGCCG 2460

GGGCAGTGCC CGCGAGGGGC TCGTCGGGCG GCCGCAGAGT CGGCGCCGGG CCGGGCGGGG 2520

CGGGAGGAGC GGCCGGGAGG AGCGCGGGCG GGCGGGCGCT GACCCGGGCC GTACGCGGCT 2580

CTACTGCCCC GGGCGCCGGC TCCGGCCCGT TTTATGCCCC GCGCCCGACG CCCCGGCCGG 2640

GGGCCTCCTC CTCAGCAAAC GGGGCGGCGG CGGCGGCTCG GCGAGGGGCC GCTGAGCCCG 2700

GGGGGTCCGA CCCAGCAGCA GCGGCCCGGA TCGCGGGTGG GGGAGGGGAG GGAGGGCTGG 2760

GACCGGGCAG GGGAGGAGGG AGGGGCGGGA GGGGAAGGGG GAGCGGGGGA GGGGGAGGGG 2820

AGGGACCAGG GGGCGCGAAG AGGGGGAGGA GAGGCGGCCC GGAGCCCCCG CTGCTGGCGG 2880

CCACAGGGCG GCTGGACCAG GAGGTCGGTG TCCAGCCCAG GAAGGGAGCC TCAGGCTAGG 2940

GAGGGGCAGA GGCTTACCTG AGGCCTGGAC CGCTCTGTGA GCGAGGCCCG GTTCCGCCCG 3000

AAGGATAAAC TTGTCTTTAA AGATACACGT ACAGGAAAGG TCCATCAGCC GATCTCCCCC 3060

TGCCTGGGCC CACAGCGCCC CCCAAACCCT CACCACCCTC TCTCACTGCC TAGCCTGCCT 3120

CCCTACCTTC TCTCTGAGGT CGCTCCTCWT TCTTGTGTTA CCCAGRACAG GGACCTAGCC 3180

AGAAACCGGC AGCATTCCCC CTTCTGTGGA GTGACAGTAT CTCCCTGTCA TTGTAACTTA 3240

TCCTCAGGCG CATTCGACAG TCCCCTCTTG CTTTCTCACC CCCTTCCTTC ACCCAAGGGA 3300

CCCTCTGCCT CTCCAGCCCA CTCCCAGCCT CCTTTCTCTT GGTTCCCTGG TCATGCCTGC 3360

CTCCCTGTCT CCTGTCTCTC CCTCCCACAC ACACCCACTA TCCTCCCAGC TATCCCAGCA 3420

CCCTCCTTCC TAATCTTGGG AGACATCTCG TCTGGCTGGA CGGGAAAATT CCAGGATCTA 3480

GGCCACACTT CTCAGCAGAC ATGCCCATCC TTGGGGAGGA GGAACAGGAG AGAGCCTGAG 3540

GAAGTTCTGG GGGACAGGGG GATGATGGGA TCAAGGTCAG GCCAGGAAGC CCCTGAGGAC 3600

AGAGACTGTG GGGAGACTTG GGACTGGGAA GAAAGCAAAG GAGCTAGAGC CAGGGCCAAA 3660

GGAAAAGGGG GGCCAGCAGG GWGGTATTTG CGGGGGAGGT CCAGCAGCTG TCTTTCCTAA 3720

GACAGGGACA CATGGGCCTG GTTATTCCTC TTGTCACATG TGGAACGGTA GGAGATGGAA 3780

GACGGAGACA GAACAAGCAA AGGAGGGCCC TGGGCACAGA GGTCTGTGTG TGTAGCCATC 3840

TAAGCCACTG GACCCCAGCA GACGAGCACC TAAGCTCAGG CTTAACCAGT GCACGTGTGC 3900 

GCACATACTG TGCCCCGCAC CTGACGTCCA CTCAACCCGT CCAAACCCTT TCCCCATAAC 3960 

ACCAACCCAT AACAGGAGAT TTCTCTCATG TGGGCAATAT CCGTGTTCCC ACTTCGAAAG 4020 

GGGGAATGAC AAGATAGGAC TCCCTAGGGG ATTACAGAAA GAAAAGCAGG AAAGCAAGCA 4080 

TCCTGTTGGA TTTCAGCAGC AGGTATGATG TCCAGGGAAA AGAAATTTGG ATAGCCAGGG 4140 

AGTGAAAACC CCACCAATCT TAAACAAGAC CTCTGTGCTT CTTCCCCAGC AACACAAATG 4200 

TCCTGCCAGA TTCCTCCTGG AAAAAACTTC TGCTCCTGTC CCCCTCCAGG TCCAGGTTGC 4260 

CCATGTCCAG GAAAAGATGG ATCCCCCTCA TCCAAATCTT CTCCGTGTGT GCTGTGGGTG 4320 

GAGTGAGTRG WARCCCTGGT CCAGGCAGGG VGCTCCAGGG AAGAGCAAGG CGTCACTTCC 4380 

GGGSGCCTTC ACCAGTGTCT GGTGGCTCCC TTCTCTGATT GGGCAGAAGT GGCCCAGGCA 4440 

GAGCGTATGA CCTGCTGCTG TGGAGGGGCT GTGCCCCACC GCCACATG 4488 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2455 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

TCTTCCTACC CATCTGCTCC CCAGAGGGCT GCCTGCTGTG CACTTGGGTC CTGGAGCCCT 60 

TCTCCACCCG GTGAGTGGCC AGCAGGGTGT GGGGTTATGT GAGGGTAGAA AGGACAGCAA 120 

AGAGAAATGG GCTCCCAGCT GGGGGAGGGG CAGGCAAACT GGAACCTACA GGCACTGACC 180 

TTTGTCGAGA AGAGTGTAGC CTTCCCAGAA TGGGAGGAGC AGGGCAGAGC AGGGGTAGGG 240 

GGTGGGGTGC TGKTTTCCTG AGGGACTGAT CACTTACTTG GTGGAATACA GCACAGCCCT 300 

GGCTGGCCCT AAGGAAAGGG GACATGAGCC CAGGGAGAAA ATAAGAGAGG GAGCTGCACT 360 

TAGGGCTTAG CAAACACAGT AGTAAGATGG ACACAGCCCC AATCCCCATT CTTAGCTGGT 420 

CATTCCTCGT TAGCTTAAGG TTCTGAATCT GGTGCTGGGG AAGCTGGGCC AGGCAAGCCA 480

GGGCGCAAGG AGAGGGTAAT GGGAGGAGGG CCCACTCATG TTGACAGACC TACAGGAAAT 540 

CCCAATATTG AATCAGGTGC AAGCCTCTTT GCACAACTTG TGAAAGGAGG AGGAAGCCAT 600 

GTGGGGGGTC CTGTGAAGGA ACCGGAAGGG GTTCTGCCAA GGGGGCAGGG AGGCAGGTGT 660 

GAGCTATGAG ACAGATATGT TAGTGGGCGC CTAAGACAAG GTAAGCCCCT AAGGTGGGCA 720

TCACCCAGCA [illegible] [illegible] [illegible] AAGGAAGTCC CAGAACTGTT 780

AGCCCATCTC TTGGCCTCAG [illegible] [illegible] [illegible] GAGAAAAGCT 840

CCAGTGGCTT TATGTGTGGG GGTAGATAGG [illegible] [illegible] CTCCCATACC 900

GCCTTTTAAT CCTGACCTCT AGTGGTCCCA [illegible] [illegible] CCCTCCCCAG 960

CCCCACTCCC CACCGCAGAA GTTACCCUTC [illegible] [illegible] CAGTTCCTCA 1020

CCCAGGCCCT GCATCCCATT [illegible] [illegible] TGAAGCCACA ATACTTTCCT 1080

TCTCTATCCC [illegible] [illegible] PTAAPAAPCA AGGTTGCTCA GAATTTAAGG 1140

UTAATTAAGA [illegible] [illegible] GTCCTGCTGC TCTCAGCAGG GGTAGGTGGC 1200

ACCAAATCCA TGTCCGATTC [illegible] [illegible] AGGAGACACC ATATGCTTTC 1260

TTGCTTTCTT TCTTTCTTTC TTTCTTTCTT XXXXXXXXXX GAGACGGAGT TTCACTCTTA 1320

TTGCCCAGGC TGGAGTGCAA [illegible] [illegible] APAACPTCCG CCTCCCAGGT 1380

ACAAGCGATT CTCCTGTCTC AGCCTCCGAA [illegible] [illegible] GAACCACCAC 1440

ACCCTGCTAG TTTTTTTGTA TTTCGTAGAG [illegible] [illegible] TGAGGCTGGT 1500

GGCGAACTCC TGACCTCAGG [illegible] [illegible] PPPAAAGTGC TGGGATTACA 1560

GGCATGAGCC ACTGCACCCG [illegible] [illegible] [illegible] GTGAGAGAAT 1620

[illegible] [illegible] [illegible] [illegible] PPTCCCCAGC ATCTGTTCAC 1680

CCTGCCAGGC [illegible] [illegible] [illegible] CACTCTTCTT GCTACTTTCA 1740

[illegible] [illegible] [illegible] [illegible] ACTCTGCCCA GAAGTGCAAG 1800

AGCUTAAGCC [illegible] [illegible] [illegible] AGAGGCCCCA AACAGGGAGC 1860

[illegible] [illegible] [illegible] [illegible] TGAGAACACA CCTGAGGGGC 1920

TAGGGCCATA [illegible] [illegible] [illegible] GAGACACGCT GCAGGGGGCA 1980

GGAAGCTGGG GGAACCCATT [illegible] [illegible] [illegible] TTCCCTGGGT 2040

TTCAGGTCTG [illegible] [illegible] [illegible] PTGAPAATGA TTTCCTCCTC 2100

ATCTTTCAAC [illegible] [illegible] [illegible] CGTGGTCATG CTTCTCCTAA 2160

[illegible] [illegible] [illegible] CTCCTGCTTG TGACCTCCGA GTCCTCAGTA 2220

AACTGCTTCG TGACTCCCAT GTCCTTCACA GCAGACTGGT GAGAACTCCC AACATTATCC 2280

CCTTTATCCG CGTAACTGGT AAGACACCCA TACTCCCAGG AAGACACCAT CACTTCCTCT 2340

AACTCCTTGA CCCAATGACT ATTCTTCCCA TATTGTCCCC ACCTACTGAT CACACTCTCT 2400

GACAAGGATT ATTCTTCACA ATACAGCCCG CATTTAAAAG CTCTCGTCTA GAACT 2455

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
TTTTGCGGCC GCTCGAGGAC ATTGATTATT GACTAGT 37



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
TTTTGCGGCC GCCGGTACTT ACGTCACTCT TGGCAC 36 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
TTTTCTCGAG GACATTGATT ATTGACTAGT 30 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CATGGGTCTT TTCTGCAGTC ACCGTCCTTG CTACCCATCT GCTCCCCAGA GGGCTGCCTG 60 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CAGGCAGCCC TCTGGGGAGC AGATGGGTAG CAAGGACGGT GACTGCAGAA AAGACCCATG 60 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
TTTTGGGCCC TCCTCCCATT ACCCTCT 27 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
TTTTCTCGAG GACATTGATT ATTGACTAGT 30
(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CGCGGATTCC CCGTGCCAAG CCTAGCGGCA ATGGCTACAG GTGAGAACAC ACCTGAGGGG 60 
CTAGGGCCA 69 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TGGCCCTAGC CCCTCAGGTG TGTTCTCACC TGTAGCCATT GCCGCTAGGC TTGGCACGGG 60 
GAATCCGCG 69 
(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TTTTGAATTC CCATTCAGGA CCCAGACCTG AAACCCAGGG AATCC 45
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
TGCCTTGAAG TGCTTCTTCA 20 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
CCTCAGAGAT GACGAGAATG C 21 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4042 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GTCAACCTTC ACAGTAATTG CTTGTTCAGT GACTGCCACA ACCCAGCCTG GCAGAGAGAG 60 

GGAAGATACC CTATAAAGCA AGGTAACGTT AATGTTGAGA CCATGAATGG CCTTGAGCAG 120 

AGCAGAGTAT CATTGCTTCC TTCAAAATTC AGAAGGATCT GATGGTGCTC TGTGAGTTCA 180 

TGGGGGTGCC TCCGTGCAGG TTGAAACCAC AGCTGTCGTC CTTCCGCTTT CCCTCTTGAT 240 

CAGTAGAAGG GTACCCTCCC TGGCCTGCAC GTCGCTGGGT CACACAACAC TGGCTGTCGT 300 

TGCACAAAGC CACGGCCACC AGCGTTCCTT TGAGGCCATT TGTTTCCAGC CATGGTGCCT 360 

ATAGGATTTT TCCTTTATCC TGTAATTTCA GCCAAATCAG AGCATGTGAC CTGGCTTAGA 420 




TGTCAATATA ATTGTTGTTA TGTGCTCTTT TCCCTTCCTG TGTCTGTGAC AGGTTTAATT 480

TAACCTGAGA AGGCTGCAGA TCCTCGGGGG TTGGTGTAAA AACACCTCAT CCTGATCTGA 540

GAAGGCGGTC AGCTTTTCTC CTCGTTGCCG TTGGCTGCCA GCACCCATTC TCTGTGGATG 600

TGAAAATCCC AGAAGGGCTG GGCTTCCTTC TTGGCATTCC CCAGGCCTAT CTCCAGAGTG 660

GGGCCCAGCA TGGGAGGATT GTACCCCACT CACTCCCCTG ATGTGGGGCT TGGACCTACA 720

GCTCGACAGC ACCCATGGAA TGTGGGCAGA AGCGACAGCA GCCAACGTCC GCCTTGGCCT 780

TAGGGCGGCA CGTGTTCTGC TTGTGCCCTG GGAGCCTCCA CCTTCCACAC TGTGGGAAGA 840

GGGTGCCCAG GGAGCTGCAG TCTCTCCAGC CCAGCCCCAG GACGAGGCCC AGGCAGCAGA 900

GCCACCCCAG CAGACCTGGC AGTGTGAGAG AAATGCATGT GTATACACTG AGTTTGCAGG 960

TGGCTGTTAC ATGGCAGCAT TGACTGACAC AGACAGAAAA GAGATCCACG AGGGAGAAGT 1020

GAGAGTGCTG GAGACTCCAA CAAGCCACAG GCTGCAGGGG CAGGATGGCT TCTTAGAAGG 1080

TGAATGATTG TTCTGGGAAT CTATCAGAGG AAGACATAGA GGCTCCAGAC GGTTGAAGGC 1140

CCAACAGTGA TCCCAGACGG GCCCCATGTC AGACCAGGCT CCTCCAGGGC TGTCGCTGCC 1200

CTCACCAAAG CCCGTCCTGA GGGCAGCCAC ACAGCAGGCA GCACTCGCCA TTTGTACAAG 1260

CGAGGCCCAA GTTCCAGCCT TCCTTCTGGC AGGTAGAGGA AGCAGGGGCA CTATGCCXGG 1320

GAGTTCCTGA AAGCAGATGG GGCAGCATTT GGTCAAGAGC CAGGAGGGGA TGACAGACCA 1380

GAGGGGAACC CTCGTCCCAC GTGCTGAGCA CACGTAGGGG GTTGGGCACT TGCTCTGTGA 1440

GCTATAATTG GTGTCCCTGT GCCCCGCCGG AAGCTGCACC AGGCAGTTTC TTGGTGGAGG 1500

ACAGTGGCCG CCCTCTAGCT TTACTCCCTT CCCCGTGATG GGTCGCTGTC AGATGTGTGT 1560

CCAGGAAAGG CAAACACCAA AGGCAGAGGA CTAGTCCCTA CACCGAATAC TCCGGTGGCC 1620

TTGCTTGGGG GCTGGGTTTT GACGTGCTGG AGGCTGTCCT AGACTTAGAG ATTAAAAACA 1680

GGGAAGAACC ATTGCTGAAA CCTTTGGAAA AGCCTGCAAT GGCCTCTGGC AGCCTGAGGA 1740

GTGGTGGTGT TTCCATCTGG TAGACGCCGT CTCAATAGGA GGGACAGATG AGTGCACCAG 1800

TGCTGCCAGC CAGAGGCGTC TGTTGGCGTG TCTTTATGGA ATGGGGTGCC AGTCTTGTGG 1860

AGGGTGGTTT ACCTTCCTGT TTCTAGTCCC CACTGGGCCT GCCTTCTGCT TCATGCCAGC 1920

TGGCCAGACC GAGCACTTTC CTGACTTTCG ACCTTGGCCC CTGCTGACTC TTGCCGTTGA 1980

GGCCTCCTGC AGACCCCATT TGTATTCATT TCCTGCAGTT CTCATACCTG AATCCCGCCT 2040

GGACTTCTGC CAACCGTTCC AGGCCCTCCT CCCAGGGGGA CCACAGATGC TACGTGCAGG 2100

GCTGTCCTTG GAGGGCCAGC ACAGCCCCTT CCAAGTGGGC AAGACCCAGG GGTGGCTCAA 2160

AAGATAGCTG TGCCCTAGCC CTGGAACCTC TGAATGTTGA TTTTTGTAGC AAAAAAGGAC 2220

TTGCAGATGT GAGTAAAGGC TGTTGAGATA AGGACATCCT CCCTGCTCTC TGGGAGGACC 2280





CCAAATGCAG GTGCACAGAT CTTAAGAAGA AGAGGCAGAG ACTGGGGTGA TGCAGCCACA 2340 

ACTAAGGAAA GCCAAGGATT GCTGGCAGCC TGCAGAAACT GGAGGGCAAG GAGCATCCCC 2400 

CAACCGCCCG GAGCCTCCAG GAGGCGCAAG GTCCTACTGA CTCCCTGACT TCAGACGTCC 2460 

AGTCTCCGGA ATTTTGAGAG GATCCATTTC TGTTATTTTA AGCAACCAAA CTTGTGGTAG 2520 

TTTCACCAGT CTCAGGAAAT GAATACGAAT GGAAAGTCAA AGATTCCAAG AAATGAGTGG 2580 

CGGGGTGCGG TGGCTCACAC TTGTAATCCC AGCATTTGCG GGAAGATTGC TTGGGCTCAG 2640 

GACTTGGAGA CCTTGTGTCT GTGAGAAACT TAAAAAATAG GCTGGGTGCG ATCGTCACGC 2700 

CTGTAATCCC AGCACTTTGG GAGGCCGAGG CAGGCGGATC ACAAGGTCAC GAGTTTGAGA 2760 

CCAGTGTGAC CAACATGGTG AAACCCTGTC TCTACTAAAA ATACAAAAAT TAGCCGGGTG 2820 

TGGTGGTGCG TGCCTGTAAT CCCAGCTACT CGGGAGGCTG AGGCAGAAGA ATTGCTTGAA 2880 

CCCAGGAAGC AGAGGTTGCA GTGAGCCGAG ATAGTATTAC TGCACTCCAG GCTGGGCAGC 2940 

AGAGCAAGAT TCCGCCTCAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA CTGAGCATGG 3000 

TAGCATGCAC CTGTGGTCCT CGTACGCCGG AGGATTGCCT GAAGCCAGGA GTTCAAGACC 3060 

AGTCTGGACA AAAGAGCAAG ACCCCATCTC TACCAAAAAA ATTTAAAAAT TAGCCAGGCA 3120 

TGGTGCCGTA CCCATAGTCT TAGCTACTCA GGAGGCTGAG GAGGGAGGAT TATCTGAGCC 3180 

TGGCGGTTGA GGCTATAATG AGCCATGATT TGGCCACTGC ACTCCAGCCT TGGCAACACA 3240 

GTGTGAGACC CTGTCTCAAA AACAATAAAA ACCCAAAACA AAAGAACCAA GAAATTACTG 3300 

GACCTGAGCC TGGCCTTTAG CTGCTGCCCT GCCCTKTGAC TGGTCACTCG GATCCCTGGG 3360 

CCTAAACACA CAGCCTATTG TCTACCTCAA GAAGGCTCCC CACTGCTTGG CTGGCAATTG 3420 

GGGTGGCTTT GCAGGCCCCA CCTGTCCTGG CCCCACGGCG CTGGTGCTGC AGGCCCCCAC 3480 

CACTGCTTGT TCCGAGCTCC CCAGCCTCCT GCAGAGTTGC CTGCACCTGA TGGCGATGAA 3540 

TCAGGAAGGC AGGCGTGTCC TGGGCCACAG AGCAGTCATG CTGTCAGCCA CCAGGGGGCT 3600 

CCATTTGCAA CTTTGGATGT GGCTTTGGCC TCTTTGTCCA AAGTGACCTT GGGGCCCCCA 3660 

GACAAGAGAC AGGGAGACTG GAGCCCAGCC CCACCCTCCC GCACATACCT GGCCCATCCC 3720 

TGCCCTATCC TGGAAGATGG GGGCCACCAC ACGTRCAAGG GACACGGGAT AGGAACCTTT 3780 

GGCCTTGTTA TCAGACATTT TAAAACTAAG TGCAAACGTG ATTATCAGGT GCAGTTTTTA 3840 

CAGCAGCAAG AAACCTGTGC TTACAGAAAG AAACACGTGC TAGCAACCCA CCTATGCGGA 3900 

AAGCCACACA GAGCCATTGT TTTCTGCACT CTCAGGTGAC GGCTCACATT TGCCCCAGGG 3960 

AAGGTCACAG CTGCCTGAAC TTTTAAAACT CCCAGACACG CACTGCCTGT GCAGGATCCG 4020 

GAGCCCAGCA GCACTGCCAG GG 4042
(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 810 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 471..810

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

CCTTGAAGTG CTTCTTCAGA GACCTTTCTT CATAGACTAC TTTTTTTTCT TTAAGCAGCA 60 

AAAGGAGAAA ATTGTCATCA AAGGATATTC CAGATTCTTG ACAGCATTCT CGTCATCTCT 120 

GAGGACATCA CCATCATCTC AGGTGAGCAC CAGGTGGAGT GCCTCTGGGT GACTGGCCGG 180 

TTTGGAGCAG GGAGGGAGGC TTAGAGTCTC ATCCTCCAGC AGCGAGTGAG GCGGAGGCTC 240 

CAGCGTCCTC CCGGGCGGGT TTTCTGGTGG ATGGAGGAGT GACTCGGGGT CCTCTACGTG 300 

GTGCCAGCTG TTTGGCTTTC TGGACGTTGT AGGAAAGGGT TTCCCCCGCC TGCGTCCCCC 360 

TGACCTTGAG CTCCACCAGC CCCTGCCAGC TGGGCTCCAG AAGGCTGGAG TGCTGTGGCA 420 

GGGATGACGT CTCACTTCTG TTATGTCTCT GTGCCCTGTG CTCTCCCAGG ATG AGG 476 

Met Arg 
1 

GGC ATG AAG CTG CTG GGG GCG CTG CTG GCA CTG GCG GCC CTA CTG CAG 524
Gly Met Lys Leu Leu Gly Ala Leu Leu Ala Leu Ala Ala Leu Leu Gln
5 10 15

GGG GCC GTG TCC CTG AAG ATC GCA GCC TTC AAC ATC CAG ACA TTT GGG 572
Gly Ala Val Ser Leu Lys Ile Ala Ala Phe Asn Ile Gln Thr Phe Gly
20 25 30

GAG ACC AAG ATG TCC AAT GCC ACC CTC GTC AGC TAC ATT GTG CAG ATC 620
Glu Thr Lys Met Ser Asn Ala Thr Leu Val Ser Tyr Ile Val Gln Ile
35 40 45 50

CTG AGC CGC TAT GAC ATC GCC CTG GTC CAG GAG GTC AGA GAC AGC CAC 668
Leu Ser Arg Tyr Asp Ile Ala Leu Val Gln Glu Val Arg Asp Ser His
55 60 65

CTG ACT GCC GTG GGG AAG CTG CTG GAC AAC CTC AAT CAG GAT GCA CCA 716
Leu Thr Ala Val Gly Lys Leu Leu Asp Asn Leu Asn Gln Asp Ala Pro
70 75 80

GAC ACC TAT CAC TAC GTG GTC AGT GAG CCA CTG GGA CGG AAC AGC TAT 764
Asp Thr Tyr His Tyr Val Val Ser Glu Pro Leu Gly Arg Asn Ser Tyr
85 90 95

AAG GAG CGC TAC CTG TTC GTG TAC AGG CCT GAC CAG GTG TCT GCG G 810
Lys Glu Arg Tyr Leu Phe Val Tyr Arg Pro Asp Gln Val Ser Ala
100 105 110



(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 
GACATTGATT ATTGACTAGT T 21 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 
TTTAAGCTTC TGCAGAAAAG ACCCATGGAA AG 32 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
TGCTCTGGCA CAACAGGTAG 20
(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CATAGATGGT CAATGCGGC 19 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

AGCTTCTGCT TTAGGAAAGT AGAAAAATAA GAGCAAATTA AATCCAAGGT AAGTAAAAAA 60 

AAAAAAAAAA AAAAAAAGAA ATAAAAATTA GAGCAGAAAT CAATAAAATT GAAGACAGTA 120 

AATCAATAAA GAAAATCAAC ATAAAAAGTC TGGTTCTTGA AAAGATATAT AAAATTGATA 180 

AGCATCTACC TAGGATAATT AAGGAAAAAA GACAGAGGAC ACAGATTACT AATATCAAAC 240 

ATAAAAGCGG GAACATCACT GCAAATTTTA TAGGCATTGA AAGCGTAATA AAAGAATACT 300 

ATAAACTATT CTATAACTAC AAATTTGATA AGTAAATAGA ATGAACCAAT TCCTTGAAAG 360 

ACATAATCTG AAAAATGTAA AAAGAAGAAA TAAACAATCT GAATAGCCTA TATCTATTAA 420 

ATAAATTGAA TCAGTAATTA ATAACCTCTC AAAACAGGAA GCACAATGCC CAGATGGGTT 480 

CACTAGTGAA TTCTATCAAA TATTTAAAGA AAAAAAAATT GTATCAACTT TCTACAATCT 540 

CTTTCAGAAG ACAGAAGCAG AGGGAATACT TCCTAAATCA TTCAACTAGG CCAGCATTAC 600 

CTTAATACCG GAACTAGAAA ATGACATTAC AAGAAAAGAA AACAACAGAC CAATATCTCT 660 

CATGAACAAA GATACAAACA TTTTCAACAA AATATTAGCA AAAAGAATCC AAGAATGTAT 720 

CAAAAAATAT ACACCACAAC CAAGTAGAAT TTATTCCAGA TATGTAAGGG TGGTTCAACG 780 

TTTGAAAATC AATTAACGTA ATTTGTCCCA TCAACAGGTT AAAGAAGAAA ATCACATGGT 840 

CATATTGATA GACACAGAAA AAGCATTTGA CAAAATTTAA CACCCATTCA TGATGCAATC 900

TCTCAGTAAA CTAGGAATAG AGGAAAACTT CCTCAGCTTG AATGTACCTT CCTCTCAATT 960

TTGCTATGAA CCTGAAACTC CTCTTAAAAA ATAAAGTTTT TCATTTAAAA AGAAAACAAA 1020 

AAACATGGAG GAGCGTTGAT GTATCTCATT TTAGACCAAT CAGCTATGGA TAGTTAGGCG 1080 

ACAGCACAGA TAGCTGCTGT ACTTCTGTTT CTGGCAATGT TCCAGACTAC ATTTAAAAAA 1140 

TTTTTAATTA TAGACTTGTA CTTAATGTTC AAGAAAAATA TGAAAATGCT TTGCCGTGTT 1200 

AATGCTACTC TTTTTTAAAA AAAACTAAAG TTCAAACTTT ATTTATATTT CATTAGTTTT 1260 

TTAGCTACTG TTCTTTTTCT GTTCTGGGAT CTCATTCAGA ATGCCACATT ACATATAATT 1320 

CTCATGTCTC CTTGGGTTCC TCTTAGTTTT GACAGTTCCT CAGACTTTTC TTATTTTTGA 1380 

TGACCTTGAC AGTTTTGAGG AGTACTGGTT AGATATAGGG TAATGGTTTT TAAAGTATAT 1440 

TTGTCATGAT TTATACTGGG TAAGGGTTTG GGAGGAAGCC ATGGGTAAGT ACTGTTCTCA 1500 

TCACATCATA TCAAGTTATA TACCATCAAT ATTGCCACAG ATGTTACTTA GCCTTTTAAT 1560 

ATTTCTCTAA TTTAGTGTAT ATGCAATGAT AGTTCTCTGA TTTCTGAGAT TGAGTTTCTC 1620 

ATGTGTAATG ATTATTTAGA GTTTCTCTTT CATCTGTTCA AATTTTGTCT AGTTTTATTT 1680 

TTTACTGATT TGTAAGACTT CTTTTTATAA TCTGCATATT ACAATTCTCT TTACTGGGGG 1740

TGTTGCAAAT ATTTTCTGTC ATTCTATGGC CTGACTTTTC TTAATGGTTT TTTAATTTTA 1800 

AAAATAAGTC TTAATATTCA TGCAATCTAA TTAACAATCT TTTCTTTGTG GTTAGGACTT 1860 

TGAGTCATAA GAAATTTTTC TCTACACTGA AGTCATGATG GCATGCTTCT ATATTATTTT 1920 

CTAAAAGATT TAAAGTTTTG CCTTCTCCAT TTAGACTTAT AATTCACTGG AATTTTTTTG 1980 

TGTGTATGGT ATGACATATG GGTTCCCTTT TATTTTTTAC ATATAAATAT ATTTCCCTGT 2040 

TTTTCTAAAA AAGAAAAAGA TCATCATTTT CCCATTGTAA AATGCCATAT TTTTTTCATA 2100 

GGTCACTTAC ATATATCAAT GGGTCTGTTT CTGAGCTCTA CTCTATTTAT CAGCCTCACT 2160 

GTCTATCCCC ACACATCTCA TGCTTTGCTC TAAATCTTGA TATTTAGTGG AACATTCTTT 2220 

CCCATTTTGT TCTACAAGAA TATTTTTGTT ATTGTCTTTT GGGCTTCTAT ATACATTTTA 2280 

GAATGAGGTT GGCAAGTTAA CAAACAGCTT TTTTGGGGTG AACATATTGA CTACAAATTT 2340 

ATGTGAAAAG AAAGTATACC TTCACAATAT TAAGTCTTTT AGTTCATGAA TATAGTATGT 2400 

CTCTCCGTTT CTGCATTAAC TTAGACATTC ATTAATTTCT CTCACAATTT ATAAGTTTAT 2460 

TTAGATCTTC ATTCATTTAA ATCTTCACTA ACCTCTCATT TACAATTTGT AAGTTTTCTG 2520 

GGTAACAGTC TTGCACTTCT TTGCCTAGAT TTATTTCCAA GTAGATTATT TTCATACATC 2580 

GTCTATGGTG TCATTTTTAA AATGTAATTT TTCACCTTTT TATTGCTAAA GAGAGATGAC 2640 

TGATTGTTAA TATTGATCTT GTGCGTGGCG ACCTTGCTGA ATTCTAATCG TTTATCTATA 2700 

AATTCTTTTG TATTTTGAAT GTAAACAATT AGATCATCTG CATATAATTT TTAAAATCTG 2760

CTTAAAATGA TGTATTTAAA AGGAAGAAAT TTTAACCCAT TCATAGGTGA GCTTCTGCCA 4680

AGATTACTAC TAATCCTCAG GAGAAGGGGT AGAGGAGAAA CTCCATAAAG GCAACTGGAA 4740 

GTGGAGTATT AGGAAGCACC TCAAGAACAC AATAGCAGGA AGTAGCTAGA GAACAAAGAG 4800 

AAGAAAACCA GAAAAAAAAA ATCCCTTTTT ATTTTTCTGT TTCCATTCCT TTGGCTCCAT 4860 

TTCCACAGCT ATGGCCTTTA TTTTCACCCT CCACAGCCAT GAGAGCCTCT GGGCAGGAGT 4920 

TCTCCTCGCC TCTCCCTGTT CCAATCACCT CTAACATTTC TGCCTATTGT TCTGCCCAGG 4980 

GAAAAAACTC CAGTCTCTTC TCTGTCAAAG ACCTCTTGAA TTAAGTCCAA ATGCTACACT 5040 

CTGGCATTCA AGACTCGTAA TACAGCTCAA CCTGACTTTT CCACCCTCAG CCTCCTTGAT 5100 

TCCTAAAATG AAGCCTGTCC ACAATTGAAG CTCCTTGTCT TTGCTCCTGC AAATTTGTTC 5160 

ATTCTCCTGG CTGTGTTTGT GCTGGTCTCT GTCTATCTAG AGCTGTGGAT ATCATGGTAT 5220 

CTATTGTCTA TCATGCTAGC CATGAACCAC ATGTGGCTGG TGAGCATTTT ATATGGTACT 5280 

AGTCTAAATT GACATCTACT GTGAGTGTAA AAATGTGCAT TATGTTTTGA AGACTGTACA 5340 

CAAAATTTAA TTATCTCATG AATAATTTTA GATTGGTTAT ATGTTGAAAT TATAATATTT 5400 

TGGATATACT ATGCTAAATA AAACATATTA TTAAAATTAA CTTCACCTGT TTCTTTTCCT 5460 

CTTTCAATAT GGCTACTAGA GCTTTTTAAA TTGCATTATG TGACTTTATT GGACAGTACC 5520 

GATTGAATGC CCTCAACCAC ATCACCTCAC CACAGCCACC TCTACCTGTA GTGATCATAC 5580 

CACTTCTTTA GGCACACTGC CTGCATTAAG GGCAATGAAT GCCTTTTCAT CTTCTCCACT 5640 

AGATGTAGTT TCTTTTTTCT TTGAGAGCCA TCATCACCAT CATGGTTGAC ACCATGAACC 5700 

TATCTGAAGA TGTCAGCCAT AGACTGCTTG ATATTCTACA GGAAAGATCA CAGTTTTAAG 5760 

TGCAATCTAC CCATGTTATT AGCAGTGTGT ATCTTTCACA CATTACACAG CCTCTCTAAG 5820 

CCTCATTTCT CTCCTCTGTA AGATGGGGAT GATAATAACC CATCTCAAAT GTTTACTATG 5880 

AGGATTATTC AAAGAATGGC AAATAGCAAG TGCTTAATAA ATGATAACTA GTACTACCGC 5940 

CACTACTGTT GTTTTTATTG TATTAGATTA TGAACTCTCT AAGGACCATT TCCGGATGGA 6000 

GGATAAGAGA CCATTTGATG TGGGCAGTGA TGAGGCCTTC TGTTGCACCT GGAAAGGTCA 6060 

ACTATATACA AGCCTGCAAG TCATTCTATA GGAGCAGGCC CCAGTGACCA GACTCTATAG 6120 

ACTGTCTCCT CTTTCCTGAG AGGGACAGCC ATCTCTAGGT TGACTAACCT CTGAAGCTCC 6180 

TTGCATTGGC TTTTGTGCTA TGAGCCATGG ATGATTCCAG ACTAATCCGA GAATGCTCGT 6240 

CAAAACCCCA AGGAATTACT CAAATACTGA CATAACAGAC ATTTTTGAGT GGAAGAGCCG 6300 

AGTTTTTTTT AATATTCTGA AACTCATTGT TTTTAAAATG CATGAGATGG CCAAGGTCTT 6360 

GCTAAGAGCT GGCCTGCAAA GCGAAAGGCA GAGAGAATGA AACCCATAGA GAGGCAGAAT 6420 

AACCAGAAAG GTTGGGACTC GTTTATTTTA TAATGTAAAT TAGTCTATTA TGAAACAATA 6480 
CTTGTTTACT GGTGGAAAAT TGGAAAATAC AAAGAATAAA AGGAGGAAAA AAATCACTCT 6540

TTAGTTTCAC AAGCCAAATC AAGCCACTAT TAAAATGGTG GTTTACTTCC TTTTATTAAT 6600 

TTTCTGTACA TATTTTTGCA TAATCATGTT GTATGTACAA TTTTATGTTC TATTTTTCAA 6660

TATTAACTGG TGTCTTTCAA ATTTCCTAAT GACAAAAATA ATATATGCTC ATAATAGAAC 6720 

ATTTTAAATG CAAATAAAAC AAAATAAATG TTAAAATTTA GTAATATTTA TTAAATTTTC 6780 

TCCAAGTGCA CGAAATTACA AATGTAACAA CCTAATTCCC TAGTGGCCTA ATAACCCTAT 6840 

TTCCAGACCT CTTCTCATTA CAAGGAAAAA CTCATATGCA GATAGTTCTA AAGGTATGAA 6900 

GTGAAAAGAT AAAGATTTTT CTTCCTTGCT GCATCCTCAC CCCATCAGCA TTATTCCCCA 6960 

GGGTAACTAC TATTAATAGA TAGTAATTCT ACCCAAAGGA AAAAATCATA TGCATATAAC 7020 

AGCATCATAT GTATACCTTT CTAGTAACTT ACAAAACAAA TGATAATATC ATATCCTTTC 7080 

TTATGTGTAT TGCTCTTTTC ACTAAATGTA TCTGTGATAT GTGTCTATAT CAGCTGATTG 7140 

TCCTTTTTGA TGGCTGAATA ATATTCCATC TTGTCCACGT GATAGTATTA CTTGACAAGC 7200 

TCCCTGCTGA TGGACATTTG TCTTTGTTAC TATGATAGTA ATATAATCAA CATTTATATA 7260 

TGTTTTGTAT GTATCTATAA TACACATGCA CATACACATG CATATTTCTG CAGGGATAGC 7320 

CATAGTAAAT AACTAGTAAC GGTATTGCAA GTTAAAGGAA CAATCTCATT GCTTGAAATT 7380 

TTAAATTTTG AAATACACTG CCAATTTTCA TGGTCTCTCC TTGTAAGCTA GTTTGGGCTT 7440 

TCTCACAGCA TGACAGGCTC AGGGCAGTCA GACCATCCTG GCCAAAGAGC AGAGTGCCAC 7500 

AGACCACAAC TGCTTCTAAT CAGCCATCTT CCCAAAGCCT TCTCTTTTTT CTATTAATAA 7560 

CTTTGTATGA GATTCCATCT TAATACTTTT CTGTTGTTTG GTCTTGTAAG AGCTTATTTT 7620 

TCTGAACCAG GAAGTGGTTC AGGGCGGTTT TTCTAACTTC ACAGAGCTCC CTCTTCTGTT 7680 

AGCTTTTGTG AAATGGTCAA AAACATAGCA GCCTGCCTTC TGAGTTCTCC ATCCCACCCT 7740 

GGTTGGGCCT TCTCTATCCT TGTCTGTGTT GTTTATATCC TGCTGAAGTG TGATTCCACT 7800 

TGTGCAGTTT CTCCTCTGTG TAGGATCAAA AGGGCTGTGG CTGGTTGGTT TGAAAATTTC 7860 

TTATACCCTA GACTATTCCA GTGCCTTTCA GAAGTTTCCA AGGCCCTCTC ACACTAATCT 7920 

ATTATCATAT TGGGCAAAAC TCCTTGCAGT TTCAGCTACT ATTCCCTGAT TGACTTTTCA 7980 

GTAAATCTAT CTCTCAGTCT TTCAGTATCC AAAGAAGATT GGTTCTAGGA CCACCATCCC 8040 

GCTGCCTCCA CAGATACCAA AATCAGAGGA TGCTCAATTC CCTCTTATAA AACGTTGCAG 8100 

TATTTGCATA TAATCTGCAC ATGTATTTCT GTATATTTTA AATCATCCCT AGATTACTTA 8160 

TAAXACCTGA TACAATATAA ATGCTAAATA GCTGTAACAC TGTATCTTTA AAATTTACAT 8220

TATTTTTTGT TGTTGTATTA TTATTTTTAT TGTATTTTTA AAAAATATTT TCCATCTACA 8280 

GTCAGTAGAA TCCACGGATA CAGAACCTAT GGATAGGAAG GACCAACTGT ATCTTTTAGT 8340 
GTTTTGAGGT TCTTG



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1584 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: mat_peptide

(B) LOCATION: 357..917



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

AATTCTCAGG TCGTTTGCTT TCCTTTGCTT TCTCCCAAGT CTTGTTTTAC AATTTGCTTT 60 

AGTCATTCAC TGAAACTTTA AAAAACATTA GAAAACCTCA CAGTTTGTAA ATCTTTTTCC 120 

CTATTATATA TATCATAAGA TAGGAGCTTA AATAAAGAGT TTTAGAAACT ACTAAAATGT 180 

AAATGACATA GGAAAACTGA AAGGGAGAAG TGAAAGTGGG AAATTCCTCT GAATAGAGAG 240 

AGGACCATCT CATATAAATA GGCCATACCC ACGGAGAAAG GACATTCTAA CTGCAACCTT 300 

TCGAAGCCTT TGCTCTGGCA CAACAGGTAG TAGGCGACAC TGTTCGTGTT GTCAAC 356 

ATG ACC AAC AAG TGT CTC CTC CAA ATT GCT CTC CTG TTG TGC TTC TCC 404 
Met Thr Asn Lys Cys Leu Leu Gln Ile Ala Leu Leu Leu Cys Phe Ser
1 5 10 15

ACT ACA GCT CTT TCC ATG AGC TAC AAC TTG CTT GGA TTC CTA CAA AGA 452 
Thr Thr Ala Leu Ser Met Ser Tyr Asn Leu Leu Gly Phe Leu Gln Arg
20 25 30 

AGC AGC AAT TTT CAG TGT CAG AAG CTC CTG TGG CAA TTG AAT GGG AGG 500 
Ser Ser Asn Phe Gln Cys Gln Lys Leu Leu Trp Gln Leu Asn Gly Arg
35 40 45 

CTT GAA TAC TGC CTC AAG GAC AGG ATG AAC TTT GAC ATC CCT GAG GAG 548 
Leu Glu Tyr Cys Leu Lys Asp Arg Met Asn Phe Asp Ile Pro Glu Glu
50 55 60 

ATT AAG CAG CTG CAG CAG TTC CAG AAG GAG GAC GCC GCA TTG ACC ATC 596 
Ile Lys Gln Leu Gln Gln Phe Gln Lys Glu Asp Ala Ala Leu Thr Ile
65 70 75 80 

TAT GAG ATG CTC CAG AAC ATC TTT GCT ATT TTC AGA CAA GAT TCA TCT 644 
Tyr Glu Met Leu Gln Asn Ile Phe Ala Ile Phe Arg Gln Asp Ser Ser
85 90 95

AGC ACT GGC TGG AAT GAG ACT ATT GTT GAG AAC CTC CTG GCT AAT GTC 692 
Ser Thr Gly Trp Asn Glu Thr Ile Val Glu Asn Leu Leu Ala Asn Val
100 105 110
(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TGACATAGGA AAACTGAAAG G 21



(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

TTTGGATCCG TTGACAACAC GAACAGTGTC G 31



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

TTTCCCGGGA CATTGATTAT TGACTAGTT 29



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

CGTGTCAAGG ACGGTGACTG C 21



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

Met Arg Gly Met Lys Leu Leu Gly Ala Leu Leu Ala Leu Ala Ala Leu 
1 5 10 15 

Leu Gln Gly Ala Val Ser Leu Lys Ile Ala Ala Phe Asn Ile Gln Thr
20 25 30 

Phe Gly Glu Thr Lys Met Ser Asn Ala Thr Leu Val Ser Tyr Ile Val
35 40 45 

Gln Ile Leu Ser Arg Tyr Asp Ile Ala Leu Val Gln Glu Val Arg Asp
50 55 60 

Ser His Leu Thr Ala Val Gly Lys Leu Leu Asp Asn Leu Asn Gln Asp
65 70 75 80 

Ala Pro Asp Thr Tyr His Tyr Val Val Ser Glu Pro Leu Gly Arg Asn 
85 90 95 

Ser Tyr Lys Glu Arg Tyr Leu Phe Val Tyr Arg Pro Asp Gln Val Ser
100 105 110 

Ala 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Met Thr Asn Lys Cys Leu Leu Gln Ile Ala Leu Leu Leu Cys Phe Ser
1 5 10 15 

Thr Thr Ala Leu Ser Met Ser Tyr Asn Leu Leu Gly Phe Leu Gln Arg
20 25 30 

Ser Ser Asn Phe Gln Cys Gln Lys Leu Leu Trp Gln Leu Asn Gly Arg
35 40 45 

Leu Glu Tyr Cys Leu Lys Asp Arg Met Asn Phe Asp Ile Pro Glu Glu
50 55 60 

Ile Lys Gln Leu Gln Gln Phe Gln Lys Glu Asp Ala Ala Leu Thr Ile
65 70 75 80 

Tyr Glu Met Leu Gln Asn Ile Phe Ala Ile Phe Arg Gln Asp Ser Ser
85 90 95 

Ser Thr Gly Trp Asn Glu Thr Ile Val Glu Asn Leu Leu Ala Asn Val
100 105 110 

Tyr His Gln Ile Asn His Leu Lys Thr Val Leu Glu Glu Lys Leu Glu
115 120 125 
Lys Glu Asp Phe Thr Arg Gly Lys Leu Met Ser Ser Leu His Leu Lys
130 135 140

Arg Tyr Tyr Gly Arg Ile Leu His Tyr Leu Lys Ala Lys Glu Tyr Ser
145 150 155 160 

His Cys Ala Trp Thr Ile Val Arg Val Glu Ile Leu Arg Asn Phe Tyr
165 170 175 

Phe Ile Asn Arg Leu Thr Gly Tyr Leu Arg Asn
180 185 



